236 research outputs found

    Introducing a Data Sliding Mechanism for Cooperative Caching in Manycore Architectures

    Get PDF
    International audienceIn this paper, we propose a new cooperative caching method improving the cache miss rate for manycore micro- architec- tures. The work is motivated by some limitations of recent adaptive cooperative caching proposals. Elastic Cooperative caching (ECC), is a dynamic memory partitioning mechanism that allows sharing cache across cooperative nodes according to the application behavior. However, it is mainly limited with cache eviction rate in case of highly stressed neighbor- hood. Another system, the adaptive Set-Granular Cooperative Caching (ASCC), is based on finer set-based mechanisms for a better adaptability. However, heavy localized cache loads are not efficiently managed. In such a context, we propose a cooperative caching strategy that consists in sliding data through closer neighbors. When a cache receives a storing request of a neighbor's private block, it spills the least recently used private data to a close neighbor. Thus, solicited saturated nodes slide local blocks to their respective neighbors to always provide free cache space. We also propose a new Priority- based Data Replacement policy to decide efficiently which blocks should be spilled, and a new mechanism to choose host destination called Best Neighbor selector. The first analytic performance evaluation shows that the proposed cache management policies reduce by half the average global communication rate. As frequent accesses are focused in the neighboring zones, it efficiently improves on-Chip traffic. Finally, our evaluation shows that cache miss rate is en- hanced: each tile keeps the most frequently accessed data 1- Hop close to it, instead of ejecting them Off-Chip. Proposed techniques notably reduce the cache miss rate in case of high solicitation of the cooperative zone, as it is shown in the performed experiments

    Asymmetric Cache Coherency: Policy Modifications to Improve Multicore Performance

    No full text
    International audienceAsymmetric coherency is a new optimisation method for coherency policies to support non-uniform work- loads in multicore processors. Asymmetric coherency assists in load balancing a workload and this is applica- ble to SoC multicores where the applications are not evenly spread among the processors and customization of the coherency is possible. Asymmetric coherency is a policy change, and consequently our designs re- quire little or no additional hardware over an existing system. We explore two different types of asymmetric coherency policies. Our bus based asymmetric coherency policy, generated a 60% coherency cost reduction (reduction of latencies due to coherency messages) for non-shared data. Our directory based asymmetric co- herency policy, showed up to a 5.8% execution time improvement and up to a 22% improvement in average memory latency for the parallel benchmarks Sha, using a statically allocated asymmetry. Dynamically allo- cated asymmetry was found to generate further improvements in access latency, increasing the effectiveness of asymmetric coherency by up to 73.8% when compared to the static asymmetric solution

    HardBlare: an efficient hardware-assisted DIFC for non-modified embedded processors

    Get PDF
    International audienceInformation Flow Control is a security mechanisms that provides security guarantees about information propagation. Other security mechanisms such as access control or cryptography can be used to limit the dissemination of confidential information and the modification of high integrity contents. However, they do not enforce end-to-end properties. They cannot control the dissemination of information once file access is allowed or the data is decrypted. In this context, HardBlare proposes a software/hardware codesign methodology to ensure that security properties are preserved all allong the execution of the system but also during files storage. The general context of HardBlare is to address Dynamic Information Flow Control (DIFC) that generally consists in attaching marks (also known as tags) to denote the type of information that are saved or generated within the system

    Using the Spring Physical Model to Extend a Cooperative Caching Protocol for Many-Core Processors

    Get PDF
    International audienceAs the number of embedded cores grows up, the off-chip memory wall becomes an overwhelming bottleneck. As a consequence, it is more and more prevalent to efficiently exploit on-chip data storage. In a previous work, we proposed a data sliding mechanism that allows to store data onto our closest neighborhood, even under heavy stress loads. However, each cache block is allowed to migrate only one time to a neighbor's cache (e.g. 1-Chance Forwarding). In this paper, we propose an extension of our mechanism in order to expand the cooperative caching area. Our work is based on an adaptive physical model, where each cache block is considered as a mass connected to a spring. This technique constrains data migration according to the spring constant and the difference of work-loads between cores. This adaptive data sliding approach leads to a balanced spread of data on the chip and therefore improves on-chip storage. On-chip data access has been evaluated using an analytical approach. Results show that the extended data sliding increases the global cache hit rate on the chip, especially in the context of juxtaposed hot spots

    Lightweight reconfiguration security services for AXI-based MPSoCs

    No full text
    International audienceNowadays, security is a key constraint in MPSoC development as many critical and secret information can be stored and manipulated within these systems. Addressing the protection issue in an efficient way is challenging as information can leak from many points. However one strategic component of a bus-based MPSoC is the communication architecture as all information that an attacker could try to extract or modify would be visible on the bus. Thus monitoring and controlling communications allows an efficient protection of the whole system. Attacks can be detected and discarded before system corruption. In this work, we propose a lightweight solution to dynamically update hardware firewall enhancements which secure data exchanges in a bus-based MPSoC. It provides a standalone security solution for AXI-based embedded systems where no user intervention is required for security mechanisms update. An FPGA implementation demonstrates an area overhead of around 11% for the adaptive version of the hardware firewall compared to the static one

    Bitstreams Repository Hierarchy for FPGA Partially Reconfigurable Systems

    No full text
    In this paper we present a hierarchy of bitstreams repositories for FPGA-based networked and partially reconfigurable systems. These systems target embedded systems with very scarce hardware resources taking advantage of dynamic, specific and optimized architectures. Based on FPGA integrated circuits, they require a single FPGA with a network controller and less external memories to store reconfiguration software, bitstreams and buffer pools used by today's standard communication protocols. Our measures, based on a real implementation, show that our repository hierarchy is functional and can download bitstreams with a reconfiguration speed ten times faster than known solutions

    Bus-based MPSoC security through communication protection: A latency-efficient alternative

    No full text
    International audienceSecurity in MPSoC is gaining an increasing attention since several years. Digital convergence is one of the numerous reasons explaining such a focus on embedded systems as much sensitive and secret data are now stored, manipulated and exchanged in these systems. Most solutions are currently built at the software level; we believe hardware enhancements also play a major role in system protection. One strategic point is the communication layer as all data goes through it. Monitoring and controlling communications enable to fend off attacks before system corruption. In this work, we propose an efficient solution with several hardware enhancements to secure data exchanges in a bus-based MPSoC. Our approach relies on low complexity distributed firewalls connected to all critical IPs of the system. Designers can deploy different security policies (access right, data format, authentication, confidentiality) in order to protect the system in a flexible way. To illustrate the benefit of such a solution, implementations are discussed for different MPSoCs implemented on Xilinx Virtex-6 FPGAs. Results demonstrate a reduction up to 33% in terms of latency overhead compared to existing efforts

    Security enhancements for FPGA-based MPSoCs: a boot-to-runtime protection flow for an embedded Linux-based system

    No full text
    International audienceNowadays, embedded systems become more and more complex: the hardware/software codesign approach is a method to create such systems in a single chip which can be based on reconfigurable technologies such as FPGAs (Field-Programmable Gate Arrays). In such systems, data exchanges are a key point as they convey critical and confidential information and data are transmitted between several hardware modules and software layers. In case of an FPGA development life cycle, OS (Operating System) / data updates as runtime communications can be done through an insecure link: attackers can use this medium to make the system misbehave (malicious injection) or retrieve bitstream-related information (eavesdropping). Recent works propose solutions to securely boot a bitstream and the associated OS while runtime transactions are not protected. This work proposes a full boot-to-runtime protection flow of an embedded Linux kernel during boot and confidentiality/integrity protection of the external memory containing the kernel and the main application code/data. This work shows that such a solution with hardware components induces an area occupancy of 10% of a xc6vlx240t Virtex-6 FPGA while having an improved throughput for Linux booting and lowlatency security for runtime protection

    A Fast Evaluation Approach of Data Consistency Protocols within a Compilation Toolchain

    Get PDF
    International audienceShared memory is a critical issue for large distributed systems. Despite several data consistency protocols have been proposed, the selection of the protocol that best suits to the application requirements and system constraints remains a challenge. The development of multi-consistency systems, where different protocols can be deployed during runtime, appears to be an interesting alternative. In order to explore the design space of the consistency protocols a fast and accurate method should be used. In this work we rely on a compilation toolchain that transparently handles data consistency decisions for a multi-protocol platform. We focus on the analytical evaluation of the consistency configuration that stands within the optimization loop. We propose to use a TLM NoC simulator to get feedback on expected network contentions. We evaluate the approach using five workloads and three different data consistency protocols. As a result, we are able to obtain a fast and accurate evaluation of the different consistency alternatives
    • 

    corecore